Automatic Construction of Cross-Lingual Networks of Concepts from the Hong Kong SAR Police Department
Authors
Abstract
The tragic events of September 11 have prompted rapidly growing attention to national security and criminal analysis. In the national security world, very large volumes of data and information are generated and gathered. Much of this data and information, written in different languages and stored in different locations, may be seemingly unconnected. Cross-lingual semantic interoperability is therefore a major challenge in generating an overview of this disparate data and information so that it can be analysed and searched. Traditional information retrieval (IR) approaches normally require a document to share some keywords with the query. In reality, users may use keywords that differ from those used in the documents. There are then two different term spaces, one for the users and another for the documents. The problem can be viewed as the creation of a thesaurus that relates terms in one space to terms in the other. Creating such relationships would allow the system to match queries with relevant documents even though they contain different terms. In addition, terrorists and criminals may communicate through letters, e-mails and faxes in languages other than English, and translation ambiguity significantly exacerbates the retrieval problem. To facilitate cross-lingual information retrieval, a corpus-based approach uses term co-occurrence statistics in parallel or comparable corpora to construct a statistical translation model that crosses the language boundary. However, collecting parallel corpora between European and Oriental languages is not an easy task because of the unique linguistic and grammatical structures of Oriental languages. In this paper, a text-based approach to aligning English/Chinese Hong Kong Police press release documents from the Web is first presented. The paper then reports an algorithmic approach to generating a robust knowledge base through statistical correlation analysis of the semantics (knowledge) embedded in the bilingual press release corpus.
The research output consists of a thesaurus-like semantic network knowledge base, which can aid in semantics-based cross-lingual information management and retrieval.
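As an illustration of how co-occurrence statistics over an aligned bilingual corpus can yield thesaurus-like links, the following minimal Python sketch scores English/Chinese term pairs across aligned document pairs. The abstract does not state which correlation measure the paper uses, so the Dice coefficient here is an assumption, as are the function name `cross_lingual_associations`, the `min_score` threshold, and the toy corpus; none of them come from the paper.

```python
from collections import Counter
from itertools import product


def cross_lingual_associations(aligned_pairs, min_score=0.5):
    """Link English terms to Chinese terms that co-occur in aligned documents.

    aligned_pairs: iterable of (english_terms, chinese_terms), one per aligned
    document pair. The Dice coefficient is an assumed stand-in for the paper's
    correlation measure, which the abstract does not specify.
    """
    en_df = Counter()    # number of document pairs containing each English term
    zh_df = Counter()    # number of document pairs containing each Chinese term
    pair_df = Counter()  # number of document pairs containing both terms

    for en_terms, zh_terms in aligned_pairs:
        en_set, zh_set = set(en_terms), set(zh_terms)
        en_df.update(en_set)
        zh_df.update(zh_set)
        pair_df.update(product(en_set, zh_set))

    links = {}
    for (en, zh), both in pair_df.items():
        score = 2 * both / (en_df[en] + zh_df[zh])  # Dice coefficient
        if score >= min_score:
            links.setdefault(en, []).append((zh, score))

    # Strongest associations first; ties broken by the Chinese term.
    for en in links:
        links[en].sort(key=lambda item: (-item[1], item[0]))
    return links


# Hypothetical, already-segmented toy corpus (not data from the paper).
toy_corpus = [
    (["police", "robbery", "arrest"], ["警方", "劫案", "拘捕"]),
    (["police", "drugs", "arrest"], ["警方", "毒品", "拘捕"]),
    (["police", "robbery"], ["警方", "劫案"]),
]
network = cross_lingual_associations(toy_corpus)
```

Each key of `network` is an English term and each value is a ranked list of Chinese terms with their association scores, which is one simple way to represent a cross-lingual network of concepts. A real system would also need Chinese word segmentation and term extraction before this step.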
Journal title:
Volume / issue:
Pages: -
Publication date: 2003